AITopics | target q-learning

A Note on Target Q-learning For Solving Finite MDPs with A Generative Oracle

Li, Ziniu, Xu, Tian, Yu, Yang

arXiv.org Machine Learning, Mar-22-2022

Q-learning is one of the simplest yet most popular algorithms in the reinforcement learning (RL) community [Sutton and Barto, 2018]. However, Q-learning can diverge when (linear) function approximation is applied [Baird, 1995, Tsitsiklis and Van Roy, 1997]. To address this instability, the well-known DQN algorithm [Mnih et al., 2015] introduced a technique called the target network. In particular, DQN maintains a copy of the main Q-network (the so-called target network) and uses it to generate the bootstrap signal for updates. One important feature is that the target network is held fixed over intervals, so, unlike in Q-learning, the learning targets in DQN do not change during an interval. In [Mnih et al., 2015, Table 3], it is reported that the target network contributes substantially to the superior performance of DQN.
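The idea of a target that stays fixed over an interval can be illustrated in the tabular setting. The following is a minimal sketch, not the algorithm analyzed in the paper: it assumes a finite MDP given as arrays P (transition probabilities) and R (rewards), treats sampling from P as the generative oracle, and uses a constant step size. The function name target_q_learning and all parameter names and default values are hypothetical.

```python
import numpy as np


def target_q_learning(P, R, gamma=0.9, outer_iters=20, inner_steps=100,
                      lr=0.1, seed=0):
    """Tabular Q-learning with a periodically frozen target table (sketch).

    P: array of shape (S, A, S), P[s, a] is a distribution over next states.
    R: array of shape (S, A), deterministic rewards (an assumption here).
    """
    rng = np.random.default_rng(seed)
    num_states, num_actions = R.shape
    q = np.zeros((num_states, num_actions))

    for _ in range(outer_iters):
        # Freeze a copy of the current estimate; it plays the role of the
        # target network and does not change during this interval.
        q_target = q.copy()
        v_target = q_target.max(axis=1)

        for _ in range(inner_steps):
            # Generative oracle: draw one next state for every (s, a) pair.
            for s in range(num_states):
                for a in range(num_actions):
                    s_next = rng.choice(num_states, p=P[s, a])
                    # Bootstrap from the frozen target, not from q itself.
                    td_target = R[s, a] + gamma * v_target[s_next]
                    q[s, a] += lr * (td_target - q[s, a])
    return q


# Hypothetical two-state MDP used only for illustration.
# State 0: action 0 stays in state 0 (reward 0),
#          action 1 moves to state 1 (reward 1).
# State 1: absorbing, reward 0 for both actions.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 1] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])

q = target_q_learning(P, R)
print(q)
```

Within one interval the inner loop regresses toward a fixed quantity, so each interval behaves like an approximate step of value iteration applied to the frozen table. Standard Q-learning corresponds to bootstrapping from q itself instead of q_target.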

q-learning, sample complexity, target q-learning

arXiv:2203.11489

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)
